Add a compose ID parser, tweak short and version rules#83
Add a compose ID parser, tweak short and version rules#83AdamWill wants to merge 1 commit intorelease-engineering:masterfrom
Conversation
This adds a `parse_compose_id` function to `composeinfo`. It can parse most compose IDs back into their component parts. As part of this, we tweak the rules for short names and versions a little. The `RELEASE_SHORT_RE` regex now allows upper-case characters, as Fedora has been using these in short names for some time (only `create_release_id` actually validated short names, up until now, which is why we hadn't noticed this inconsistency). The `RELEASE_VERSION_RE` regex now does *not* allow absolutely any string that doesn't start with a digit to be used as a version; the only string version allowed is 'Rawhide' (this is the only one that exists in the wild so far as I know). This is to make it possible to write a sane parser at all. We also fix `get_date_type_respin` to work with the new `ci` type. We could possibly go further than this and disallow the use of dashes in short names; if we did that, parsing compose IDs would become much less difficult and slightly more reliable. But it would require us to either rename or 'grandfather in' Fedora's existing short names with dashes in them, which are Fedora-Atomic, Fedora-Docker and Fedora-Cloud. It would be reasonably easy to write a parser and regex which special case those exact names but disallow any other with a dash. Signed-off-by: Adam Williamson <awilliam@redhat.com>
|
Note: I considered having the parser return a namedtuple rather than a dict. I slightly dislike dict as a return type for this kind of thing because it's easy to forget that dicts are mutable and the pitfalls of dealing with mutable objects. But I thought I'd go with the core type to start with. If you like the idea of a namedtuple, yell. Note that namedtuples appeared in Python 2.6, so they're available for RHEL 6 but not RHEL 5; dunno if that's a problem. Please suggest any other compose IDs to test the parser against, this set includes all the ones from the Please also let me know what you think of the 'upper-case characters in short names' issue, and the possibility of disallowing dashes in short names. |
|
Ping? Anything? |
|
Think again about this, it seems like a lot of relatively fragile code. If I recall correctly, your use case is consuming data from Fedmsg and you want to avoid having to download the metadata files, right? If so, what about changing the message and actually putting all this information directly into the message? |
|
Basically my case is we've been doing this for a while and it's actually easier to write this code than go back through all the various things we do with composes and unpick the ones that find composes by compose ID. If you don't accept the code in productmd I'm more likely to just keep it in fedfind than change all of those things. |
This adds a
parse_compose_idfunction tocomposeinfo. Itcan parse most compose IDs back into their component parts. As
part of this, we tweak the rules for short names and versions
a little. The
RELEASE_SHORT_REregex now allows upper-casecharacters, as Fedora has been using these in short names for
some time (only
create_release_idactually validated shortnames, up until now, which is why we hadn't noticed this
inconsistency). The
RELEASE_VERSION_REregex now does notallow absolutely any string that doesn't start with a digit to
be used as a version; the only string version allowed is
'Rawhide' (this is the only one that exists in the wild so far
as I know). This is to make it possible to write a sane parser
at all.
We also fix
get_date_type_respinto work with the newcitype.
We could possibly go further than this and disallow the use of
dashes in short names; if we did that, parsing compose IDs
would become much less difficult and slightly more reliable.
But it would require us to either rename or 'grandfather in'
Fedora's existing short names with dashes in them, which are
Fedora-Atomic, Fedora-Docker and Fedora-Cloud. It would be
reasonably easy to write a parser and regex which special case
those exact names but disallow any other with a dash.
Signed-off-by: Adam Williamson awilliam@redhat.com